{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Regression with Outliers\n",
    "\n",
    "In the standard __Gaussian process regression__ setting it is assumed that the observations are __Normally distributed__ about the latent function. In the package this can applied using either the `GP` or `GPE` functions with which *exact Gaussian process* models.\n",
    "\n",
    "One of the drawbacks of exact GP regression is that by assuming Normal noise the GP is __not robust to outliers__. In this setting, it is more appropriate to assume that the distribution of the noise is heavy tailed. For example, with a __Student-t distribution__,\n",
    "$$\n",
    "\\mathbf{y} \\ | \\ \\mathbf{f},\\nu,\\sigma \\sim \\prod_{i=1}^n \\frac{\\Gamma(\\nu+1)/2}{\\Gamma(\\nu/2)\\sqrt{\\nu\\pi}\\sigma}\\left(1+\\frac{(y_i-f_i)^2}{\\nu\\sigma^2}\\right)^{-(\\nu+1)/2}\n",
    "$$\n",
    "\n",
    "Moving away from the Gaussian likelihood function (i.e. Normally distributed noise) and using the Student-t likelihood means that we can no longer analytically calculate the GP marginal likelihood. We can take a Bayesian perspective and sample from the joint distribution of the latent function and model parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<img src=\"\" />"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Load functions from packages\n",
    "using GaussianProcesses, Plots\n",
    "using Distributions:Normal, TDist\n",
    "using Klara:MALA\n",
    "\n",
    "#Simulate the data\n",
    "srand(112233)\n",
    "n = 20\n",
    "X = collect(linspace(-3,3,n))\n",
    "sigma = 1.0\n",
    "Y = X + sigma*rand(TDist(3),n);\n",
    "\n",
    "# Plots observations\n",
    "pyplot()\n",
    "scatter(X,Y;fmt=:png, leg=false)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We fit a standard (exact) Gaussian process model to the Student-t data and compare this against the Monte Carlo GP which is applicable for non-Gaussian observations models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "GP Monte Carlo object:\n",
       "  Dim = 1\n",
       "  Number of observations = 20\n",
       "  Mean function:\n",
       "    Type: GaussianProcesses.MeanZero, Params: Float64[]\n",
       "  Kernel:\n",
       "    Type: GaussianProcesses.Mat32Iso, Params: [0.0, 0.0]\n",
       "  Likelihood:\n",
       "    Type: GaussianProcesses.StuTLik, Params: [0.1]\n",
       "  Input observations = \n",
       "[-3.0 -2.68421 … 2.68421 3.0]\n",
       "  Output observations = [-3.98707, -2.7855, -1.74722, -2.27384, -7.12662, -1.43191, -0.840899, 0.305618, -0.444213, -0.924319, -0.0303161, -0.368707, -0.706642, 2.07517, 5.68693, -1.02634, 3.25863, 5.46091, 4.45861, 4.60741]\n",
       "  Log-posterior = -77.863"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Build the models\n",
    "\n",
    "gpe = GPE(X,vec(Y),MeanZero(),Matern(3/2,0.0,0.0),0.5) #Exact GP assuming Gaussian noise\n",
    "\n",
    "l = StuTLik(3,0.1)\n",
    "gpmc = GPMC(X, vec(Y), MeanZero(), Matern(3/2,0.0,0.0), l) #Monte Carlo GP with student-t likelihood"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Estimate the parameters of the exact GP through maximum likelihood estimation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dict(:mean=>true,:kern=>true,:noise=>true)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "Results of Optimization Algorithm\n",
       " * Algorithm: L-BFGS\n",
       " * Starting Point: [0.5,0.0,0.0]\n",
       " * Minimizer: [0.6566151884189473,1.7118340256787345, ...]\n",
       " * Minimum: 4.578633e+01\n",
       " * Iterations: 11\n",
       " * Convergence: true\n",
       "   * |x - x'| < 1.0e-32: false\n",
       "   * |f(x) - f(x')| / |f(x)| < 1.0e-32: false\n",
       "   * |g(x)| < 1.0e-08: true\n",
       "   * f(x) > f(x'): false\n",
       "   * Reached Maximum Number of Iterations: false\n",
       " * Objective Function Calls: 46\n",
       " * Gradient Calls: 46"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "optimize!(gpe)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Taking a Bayesian perspective, we can add prior distributions to the model parameters and sample from the posterior distribution using Markov chain Monte Carlo through the `mcmc` function. This function builds on the [Klara](https://github.com/JuliaStats/Klara.jl) package and a list of available samplers can be found through this package's documentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "BasicMCJob:\n",
      "  Variable [1]: p (BasicContMuvParameter)\n",
      "  GenericModel: 1 variables, 0 dependencies (directed graph)\n",
      "  MALA sampler: drift step = 0.05\n",
      "  VanillaMCTuner: period = 100, verbose = false\n",
      "  BasicMCRange: number of steps = 49991, burnin = 10000, thinning = 10"
     ]
    },
    {
     "data": {
      "text/plain": [
       "23×4000 Array{Float64,2}:\n",
       " -1.44975    -1.44975    -1.02479   …  -1.43093    -1.23015   -1.23015 \n",
       "  0.588195    0.588195    0.332809      1.08002     1.17153    1.17153 \n",
       "  0.718085    0.718085    0.211443      0.493442    0.385335   0.385335\n",
       " -1.32609    -1.32609    -1.15774      -0.333319   -0.200674  -0.200674\n",
       " -0.721646   -0.721646   -0.855224     -0.55112    -0.794651  -0.794651\n",
       "  0.851406    0.851406    0.986268  …   1.32336     1.21235    1.21235 \n",
       " -0.250237   -0.250237   -0.296884     -0.298572   -0.166915  -0.166915\n",
       "  0.154486    0.154486    0.387819      0.0714513   0.162703   0.162703\n",
       " -0.0969148  -0.0969148  -0.324955     -0.446952   -0.267948  -0.267948\n",
       " -0.380338   -0.380338    0.133223      0.765023    0.42936    0.42936 \n",
       "  0.378311    0.378311    0.555143  …  -0.226185   -0.211079  -0.211079\n",
       " -0.918741   -0.918741   -0.744673     -0.589133   -0.436455  -0.436455\n",
       " -0.136525   -0.136525    0.123217      1.7611      1.83721    1.83721 \n",
       "  1.77061     1.77061     1.61585      -0.923194   -0.784206  -0.784206\n",
       "  1.73116     1.73116     1.87613       0.82272     0.86416    0.86416 \n",
       " -0.678761   -0.678761   -0.545683  …   0.386       0.503511   0.503511\n",
       "  0.830114    0.830114    0.782332      2.30027     2.27326    2.27326 \n",
       "  1.28389     1.28389     1.61891       2.18992     2.05585    2.05585 \n",
       "  0.247051    0.247051    0.492288     -0.479602   -0.876566  -0.876566\n",
       " -0.385056   -0.385056   -0.53043      -0.780789   -0.959797  -0.959797\n",
       " -0.372006   -0.372006   -0.116689  …  -0.138475   -0.17703   -0.17703 \n",
       " -0.0997188  -0.0997188  -0.338012      0.869903    1.11409    1.11409 \n",
       "  1.1369      1.1369      1.10389       1.05892     1.07458    1.07458 "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "set_priors!(gpmc.lik,[Normal(-2.0,4.0)])\n",
    "set_priors!(gpmc.k,[Normal(-2.0,4.0),Normal(-2.0,4.0)])\n",
    "\n",
    "samples = mcmc(gpmc;sampler=MALA(0.05),nIter=50000, thin=10, burnin=10000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<img src=\"\" />"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Plot posterior samples\n",
    "xtest = linspace(minimum(gpmc.X),maximum(gpmc.X),50);\n",
    "\n",
    "#Set the parameters to the posterior values the sample random function\n",
    "fsamples = [];\n",
    "for i in 1:size(samples,2)\n",
    "    set_params!(gpmc,samples[:,i])\n",
    "    update_target!(gpmc)\n",
    "    push!(fsamples, rand(gpmc,xtest))\n",
    "end\n",
    "\n",
    "#Predict\n",
    "p1=plot(gpe,leg=false, title=\"Exact GP\")   #Exact GP (assuming Gaussian noise)\n",
    "\n",
    "sd = [std(fsamples[i]) for i in 1:50]\n",
    "p2=plot(xtest,mean(fsamples),ribbon=2*sd,leg=false, title=\"Monte Carlo GP\") #GP Monte Carlo with student-t noise\n",
    "scatter!(X,Y)\n",
    "\n",
    "plot(p1,p2;fmt=:png)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 0.6.0",
   "language": "julia",
   "name": "julia-0.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "0.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
